Data collected in Run II of the Fermilab Tevatron are searched for indications of new
electroweak-scale physics. Rather than focusing on particular new physics scenarios, CDF data are
analyzed for discrepancies with the standard model prediction. A model-independent approach (Vista)
considers gross features of the data, and is sensitive to new large cross-section physics. Further
sensitivity to new physics is provided by two additional algorithms: a Bump Hunter searches
invariant mass distributions for "bumps" that could indicate resonant production of new particles,
and the Sleuth procedure scans for data excesses at large summed transverse momentum. This combined
global search for new physics in 2.0 fb$^{-1}$ of $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV
reveals no indication of physics beyond the standard model.



The standard model (SM) of particle physics has been remarkably successful in describing observed
phenomena, but is generally believed to require expansion. Using data corresponding to an
integrated luminosity of 2.0 fb$^{-1}$ of $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV collected
by the CDF II detector at the Fermilab Tevatron, we present a broad search for physics beyond the
standard model without focusing on any specific proposed scenario. A similar search has previously
been performed by the CDF Collaboration with 927 pb$^{-1}$ of data [1].

Events containing one or more particles with large transverse momentum ($p_T$) are analyzed for
discrepancies relative to the SM prediction. A model-independent approach (Vista) considers gross
features of the data, and is sensitive to new large cross-section physics. Further sensitivity to
beyond-SM physics is provided by two additional algorithms: a Bump Hunter searches invariant mass
distributions for "bumps" that could indicate resonant production of new particles, and the Sleuth
procedure scans for data excesses at large summed transverse momentum. These global algorithms
provide a complementary approach to searches optimized for more specific new physics scenarios.

CDF II [2] is a general-purpose detector for high-energy $p\bar{p}$ collisions. Tracking for
charged particles is provided by silicon strip detectors and a gas drift chamber inside a 1.4 T
magnetic field. The tracking system is surrounded by electromagnetic and hadronic calorimeters and
enclosed by muon detectors.

The Vista procedure is extensively described in [1]. A standard set of object identification
criteria is used to identify isolated and energetic objects produced in the hard collision,
including electrons ($e^\pm$), muons ($\mu^\pm$), taus ($\tau^\pm$), photons ($\gamma$), jets
($j$), jets originating from a bottom quark ($b$), and missing transverse momentum ($\not{p}_T$).
All objects are required to have $p_T > 17$ GeV/$c$. With all event selections applied, over
4 x 10^ high-$p_T$ events are analyzed in this global search. The standard model prediction is
based on Monte Carlo event generators and a simulation of the response of the CDF detector. Data
and Monte Carlo events are partitioned into exclusive final states labeled according to the number
and type of objects ($e^\pm$, $\mu^\pm$, $\tau^\pm$, $\gamma$, $j$, $b$, $\not{p}_T$) identified in
each event.
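
As a rough illustration of the partitioning step, the sketch below builds an exclusive final-state
label from the identified objects in an event. The object codes, the label format, and the helper
names `final_state_label` and `partition` are assumptions made for this example and are not taken
from the CDF implementation; only the 17 GeV/$c$ threshold comes from the text.

```python
from collections import Counter

# Hypothetical object codes and label ordering, chosen for this sketch only.
OBJECT_ORDER = ["e", "mu", "tau", "ph", "j", "b", "met"]
PT_MIN = 17.0  # GeV/c, the object threshold quoted in the text


def final_state_label(objects):
    """Return an exclusive final-state label such as '1e1mu'.

    `objects` is a list of (kind, pt) pairs; objects at or below PT_MIN
    are ignored, as are kinds not listed in OBJECT_ORDER.
    """
    counts = Counter(kind for kind, pt in objects if pt > PT_MIN)
    return "".join(f"{counts[k]}{k}" for k in OBJECT_ORDER if counts[k] > 0)


def partition(events):
    """Group events (lists of objects) by their final-state label."""
    states = {}
    for event in events:
        states.setdefault(final_state_label(event), []).append(event)
    return states


events = [
    [("e", 40.0), ("mu", 25.0)],
    [("j", 60.0), ("j", 30.0), ("j", 10.0)],  # third jet is below threshold
    [("mu", 22.0), ("e", 35.0)],
]
states = partition(events)
```

Because each event receives exactly one label, the resulting final states are exclusive: no event
is counted in two populations.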

FIG. 1: Distribution of observed discrepancy between data and the SM prediction for populations of
final states, measured in units of standard deviation ($\sigma$). The black line represents the
theoretical expectation assuming no new physics.

FIG. 2: Distribution of observed discrepancy between data and the SM prediction for shapes of
kinematic distributions, measured in units of standard deviation ($\sigma$). The black line
represents the theoretical expectation assuming no new physics.

To obtain an accurate standard model prediction, a correction model is used to reduce systematic
deficiencies in the Monte Carlo theoretical prediction and the simulation of the detector response;
this information can only be obtained from the data themselves. The details of this correction
model are motivated by individual discrepancies noted in a global comparison of CDF high-$p_T$ data
to the SM prediction; however, the correction model is intentionally kept as simple as possible in
order to avoid over-tuning. The correction model includes specific correction factors for the
integrated luminosity of the sample, the ratios ($k$-factors) of the actual cross sections for SM
processes to the leading order approximations given by event generators, object identification
efficiencies, object misidentification rates, and trigger efficiencies. Values for the correction
factors are determined from a global fit to the data: a global $\chi^2$ is formed by comparison to
the SM prediction, and minimized as a function of the correction factors. External information
(such as higher-order cross-section calculations) is used to constrain 26 of the 43 total
correction factors. A number of minor improvements have been made to the correction model since
[1]; these changes are described in detail in [3].
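
The idea of a constrained fit can be illustrated with a deliberately reduced toy: a single
multiplicative correction factor `k` scaling the Monte Carlo prediction, with an optional Gaussian
penalty representing external information. The form of the $\chi^2$, the use of the observed count
as the variance, and all numbers below are assumptions for illustration; the actual fit involves 43
correction factors and is described in the references.

```python
def chi2(k, data, mc, k_prior=None, k_sigma=None):
    """Chi-square between data counts and k-scaled Monte Carlo counts.

    The variance of each count is approximated by the observed count
    (an assumption of this sketch). If k_prior and k_sigma are given,
    a Gaussian penalty term constrains k to that external value.
    """
    total = sum((d - k * m) ** 2 / d for d, m in zip(data, mc))
    if k_prior is not None:
        total += ((k - k_prior) / k_sigma) ** 2
    return total


def best_k(data, mc, k_prior=None, k_sigma=None):
    """Closed-form minimum of chi2 for a single multiplicative factor."""
    # Setting d(chi2)/dk = 0 gives k * sum(m^2/d) = sum(m), plus penalty terms.
    num = sum(m for m in mc)
    den = sum(m * m / d for d, m in zip(data, mc))
    if k_prior is not None:
        num += k_prior / k_sigma ** 2
        den += 1.0 / k_sigma ** 2
    return num / den


data = [120.0, 480.0, 60.0]   # toy observed populations
mc = [100.0, 400.0, 50.0]     # toy leading-order predictions

k_free = best_k(data, mc)                                  # data alone
k_constrained = best_k(data, mc, k_prior=1.1, k_sigma=0.05)  # with external input
```

In this toy the external constraint pulls the fitted factor from the data-only value towards the
prior, which is the role the higher-order calculations play for the constrained factors.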

The first stage in the Vista global comparison is to study the populations of the exclusive final
states, compared to the SM expectation. Figure 1 summarizes the population discrepancies in all 399
final states, and the ten final states with the largest deviation from the SM expectation are
listed in Table I. After accounting for the trials factor associated with considering many final
states, we find that no final state exhibits a statistically significant population discrepancy.

Final State      Data   Background      $\sigma$   $\sigma_t$
                 690    817.7 ± 9.2     -3.5       -1.3
$j\,2\tau^\pm$   105    150.8 ± 6.3     -3.4       -1.2

TABLE I: The ten most discrepant Vista final states, showing the number of data events observed and
the number of background events expected. Only Monte Carlo statistical uncertainties on the
background prediction are included. $\sigma$ and $\sigma_t$ represent the level of discrepancy
before and after accounting for the trials factor.
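
A minimal sketch of how a trials factor dilutes a local significance is given below. It assumes
independent final states, a one-sided Gaussian conversion between probability and standard
deviations, and the relation $p_t = 1 - (1 - p)^N$; the function names are hypothetical. Under
these assumptions a 3.5$\sigma$ local effect among 399 final states corresponds to roughly
1.3$\sigma$ after trials, in line with the magnitudes quoted for the most discrepant final state.

```python
from statistics import NormalDist

_NORMAL = NormalDist()  # standard normal distribution


def sigma_to_p(sigma):
    """One-sided Gaussian tail probability for a significance in sigma."""
    return 1.0 - _NORMAL.cdf(sigma)


def p_to_sigma(p):
    """Inverse of sigma_to_p."""
    return _NORMAL.inv_cdf(1.0 - p)


def after_trials(p, n_trials):
    """Probability that at least one of n_trials independent tests
    yields a p-value as small as p."""
    return 1.0 - (1.0 - p) ** n_trials


# Magnitude of the local discrepancy and number of final states from the text.
sigma_t = p_to_sigma(after_trials(sigma_to_p(3.5), 399))
```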

The Vista global comparison also considers the shapes of kinematic distributions. The
Kolmogorov-Smirnov test is used to assess the agreement between data and the SM prediction for
19 650 distributions. The results are summarized in Fig. 2, which shows the degree of discrepancy
measured for each distribution, displayed in units of standard deviation ($\sigma$). Distributions
exhibiting disagreement between data and SM prediction are at large positive $\sigma$.
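
For readers unfamiliar with the Kolmogorov-Smirnov test, the sketch below computes its test
statistic, the largest distance between two empirical cumulative distributions. It is a simplified
two-sample, unweighted version written for illustration; comparing data to a weighted Monte Carlo
prediction and converting the statistic to a probability are not covered here.

```python
def ks_statistic(sample_a, sample_b):
    """Largest distance between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d_max = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step past every entry equal to x in both samples (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d_max = max(d_max, abs(i / len(a) - j / len(b)))
    # Once one sample is exhausted the distance can only shrink, so stop.
    return d_max


identical = ks_statistic([1, 2, 3], [1, 2, 3])
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])
shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```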

We find that 555 distributions have a significant discrepancy, defined as being greater than
$3\sigma$ after accounting for the trials factor associated with the number of distributions
considered. The discrepant distributions fall into three categories. Residual "crudeness" in the
correction model, primarily from using simplified $p_T$ dependences for fake rate correction
functions, accounts for 3%. Another 16% are attributed to an inadequate modeling of the transverse
boost of the colliding system. The remaining 81% most likely arise from incorrect modeling of soft
QCD parton showering. This is best exemplified by Fig. 3, which shows
$\Delta R = \sqrt{\Delta\phi^2 + \Delta\eta^2}$ between the second and third highest $p_T$ jets in
the Vista 3-jet final state. This observation has been discussed in more detail in [?]. The nature
of these shape discrepancies does not warrant treating any of them as indicative of potential new
physics.
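
The angular separation used in this comparison can be computed as in the following sketch, which
takes the pseudorapidity $\eta$ and azimuthal angle $\phi$ (in radians) of two jets. The function
names are illustrative; the only subtlety is wrapping the azimuthal difference so that jets on
either side of the $\phi = 0$ boundary are treated as close.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Delta R = sqrt(delta_phi^2 + delta_eta^2) between two objects."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


# Two jets at the same eta, on either side of the phi = 0 boundary.
wrapped = delta_r(0.0, 0.1, 0.0, 2.0 * math.pi - 0.1)
```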

A statistically significant local excess of data in an invariant mass variable would be the most
direct evidence of resonant production of a new particle. The Bump Hunter algorithm is designed to
identify mass resonances with a narrow natural width that would appear as Gaussian bumps on top of
the SM background, with width equal to the detector resolution. The Bump Hunter searches in all
exclusive final states, and examines all mass variables that can be constructed from combinations
of the final state objects. If there is $\not{p}_T$, transverse mass variables are also considered.
The SM background is obtained from the Vista procedure.

FIG. 3: One of the significant shape discrepancies seen by Vista,
$\Delta R = \sqrt{\Delta\phi^2 + \Delta\eta^2}$ between the second and third highest $p_T$ jets in
the Vista 3-jet final state. Data are shown as filled (black) circles, with the standard model
prediction shown as the shaded (red) histogram.

The method is described in detail in [?]. Each mass variable is scanned with a sliding window of
width equal to twice the typical detector resolution for the component objects. Only windows that
contain at least 5 data events are considered. The p-value for the window is defined as the Poisson
probability that the expected SM background would fluctuate up to or above the number of data
events observed. To ensure the window really represents a bump of the correct resolution-based
width and not some broader excess, the "side-bands" of equal width on either side of the central
window are required to be within 5 standard deviations of the SM expectation, and less discrepant
than the central window.
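
The window criteria above can be sketched in simplified form. The example below assumes the mass
axis has already been cut into adjacent slices of one window width each, so that the side-bands of
slice `i` are slices `i-1` and `i+1`; the real scan uses a sliding window. It also assumes that
"within 5 standard deviations" can be approximated by a simple pull, and that "less discrepant"
means a larger Poisson p-value. These choices and all names are illustrative.

```python
import math


def poisson_tail(n_obs, mean):
    """P(N >= n_obs) for N distributed as Poisson(mean)."""
    term = math.exp(-mean)  # P(N = 0)
    below = 0.0
    for k in range(n_obs):
        below += term
        term *= mean / (k + 1)
    return max(0.0, 1.0 - below)


def scan(data_counts, sm_counts, min_events=5, max_sideband_sigma=5.0):
    """Return (p_value, index) of the best bump candidate, or (None, None)."""
    best_p, best_i = None, None
    for i in range(1, len(data_counts) - 1):
        if data_counts[i] < min_events:
            continue
        p_center = poisson_tail(data_counts[i], sm_counts[i])
        accepted = True
        for s in (i - 1, i + 1):
            pull = abs(data_counts[s] - sm_counts[s]) / math.sqrt(sm_counts[s])
            p_side = poisson_tail(data_counts[s], sm_counts[s])
            # Side-bands must agree with the SM and be less discrepant.
            if pull > max_sideband_sigma or p_side <= p_center:
                accepted = False
        if accepted and (best_p is None or p_center < best_p):
            best_p, best_i = p_center, i
    return best_p, best_i


data = [10, 12, 30, 11, 9]             # toy observed counts per slice
sm = [10.0, 11.0, 12.0, 11.0, 10.0]    # toy SM expectation per slice
best_p, best_i = scan(data, sm)
```

In this toy input only the central slice qualifies: its neighbours agree with the expectation,
while the neighbours themselves are rejected as candidates because one of their side-bands is more
discrepant than they are.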

In each mass variable, the bump candidate with the smallest p-value is selected. The significance
of this bump is given by $P_a$, the fraction of pseudoexperiments which would have produced a more
interesting bump in this mass variable purely by random fluctuations of the SM background. $P_a$
incorporates the trials factor associated with examining multiple overlapping windows within the
mass variable. Because determining $P_a$ by pseudoexperiments for all mass variables is
computationally prohibitive, an analytic approximation is used instead. If the analytic estimate
returns a value of $P_a$ with a significance of $> 4.5\sigma$, then pseudoexperiments are performed
for accurate determination.
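
The pseudoexperiment definition of $P_a$ can be illustrated with a toy Monte Carlo. The sketch
below assumes independent, non-overlapping windows with known SM expectations and ignores the
side-band and minimum-event requirements; the sampler, the seed, and the function names are choices
made for this example rather than details of the analysis.

```python
import math
import random


def poisson_tail(n_obs, mean):
    """P(N >= n_obs) for N distributed as Poisson(mean)."""
    term = math.exp(-mean)
    below = 0.0
    for k in range(n_obs):
        below += term
        term *= mean / (k + 1)
    return max(0.0, 1.0 - below)


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def fraction_more_interesting(p_observed, sm_counts, n_pseudo=2000, seed=1):
    """Fraction of pseudoexperiments whose smallest window p-value
    is below the one observed in data."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_pseudo):
        p_min = min(poisson_tail(sample_poisson(b, rng), b) for b in sm_counts)
        if p_min < p_observed:
            hits += 1
    return hits / n_pseudo


sm = [10.0, 11.0, 12.0]  # toy SM expectation per window
f_small = fraction_more_interesting(0.01, sm)
f_large = fraction_more_interesting(0.5, sm)
```

The number of pseudoexperiments needed grows inversely with the probability being resolved, which
is why an analytic estimate is attractive for the bulk of the mass variables.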

Each mass variable is further assigned a probability $P_b$, defined as the probability under the
null hypothesis that any mass variable would appear more significant than this one. Assuming no
correlations, $P_b = 1 - (1 - P_a)^N$, where $N$ is the total number of mass variables examined. If
$P_b$ corresponds to a significance of $> 3\sigma$, that effect is then considered as potentially
due to new physics.

FIG. 4: Significance of the most interesting bump in each mass variable ($P_a$, in units of
standard deviations) considered by the Bump Hunter. The black line represents the theoretical
expectation assuming no new physics.

FIG. 5: Distribution of the invariant mass of all four jets in the $4j$ $\sum p_T < 400$ GeV final
state. This variable contains the most significant bump found by the Bump Hunter, indicated by the
dashed (blue) lines.

The Bump Hunter examines 5036 mass variables, of which 2316 are found to have at least one local
excess satisfying our bump definition (the other variables are mainly from small-population final
states which fail to satisfy the criterion of 5 data events in a mass window). The expected and
observed distributions of $P_a$, converted to units of $\sigma$, are shown in Fig. 4. The
distribution of $P_a$ in the data is seen to be shifted towards positive $\sigma$ relative to the
expectation, indicating disagreement between data and the SM prediction. This reflects the fact
that the Bump Hunter algorithm is quite sensitive to local features in mass variables, which can
arise because the Monte Carlo-based SM background prediction does not perfectly describe the data.
The sharp drop seen in the data at $P_a = 4.5\sigma$ results from the transition between analytic
estimation and accurate determination of $P_a$.

The only mass variable with a bump which exceeds the discovery threshold is the invariant mass of
all four jets in the $4j$ final state, shown in Fig. 5. This mass variable has $P_a$ corresponding
to $5.7\sigma$, and $P_b$ to $4.1\sigma$. However, this bump is attributed to the aforementioned
difficulty modeling soft QCD jets and is not thought to indicate new physics.

The final component of this global search for physics beyond the standard model is a procedure
called Sleuth [?]. Sleuth is a quasi-model-independent search technique, based on the assumption
that new electroweak-scale physics will manifest itself as a high-$p_T$ excess of data over the SM
expectation in a particular final state. Tests have shown Sleuth to have sensitivity comparable to
targeted searches for phenomena that satisfy Sleuth's basic assumptions.

FIG. 6: The distribution of $\mathcal{P}$ in the data, with one entry for each final state
considered by Sleuth. The black line represents the theoretical expectation assuming no new
physics.

The procedure is identical to that used in [?]. The algorithm considers a single variable, the
summed scalar transverse momentum ($\sum p_T$) of all objects in the event. The SM prediction for
the distribution of $\sum p_T$ is determined as part of the Vista procedure. The exclusive final
states examined by Sleuth are created by merging Vista final states according to certain rules
described in [1]. For each final state, Sleuth determines the region (defined as an interval in
$\sum p_T$ extending from a data point up to infinity) which has the smallest probability that the
SM prediction would fluctuate up to or above the number of observed data events. The algorithm then
finds $\mathcal{P}$, the fraction of pseudoexperiments drawn from the SM $\sum p_T$ distribution
which produce any region more interesting than the region found in the data. Sleuth selects the
final state with the smallest value of $\mathcal{P}$, and calculates the overall significance,
$\tilde{\mathcal{P}}$, which accounts for the number of final states considered. With an accurate
correction model and in the absence of new physics, the distribution of $\mathcal{P}$ is uniform
between zero and unity; in the presence of new physics, a small value of $\tilde{\mathcal{P}}$ is
expected. The threshold for pursuit of a possible discovery case is taken to be
$\tilde{\mathcal{P}} < 0.001$.
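
The region search at the heart of this procedure can be sketched as follows. The example assumes
the SM expectation above any threshold is available as a function (here a hypothetical
exponentially falling spectrum) and only performs the scan over data points; the pseudoexperiment
step and the merging of final states are not reproduced. All names and numbers are illustrative.

```python
import math


def poisson_tail(n_obs, mean):
    """P(N >= n_obs) for N distributed as Poisson(mean)."""
    term = math.exp(-mean)
    below = 0.0
    for k in range(n_obs):
        below += term
        term *= mean / (k + 1)
    return max(0.0, 1.0 - below)


def most_interesting_region(data_sum_pt, sm_expected_above):
    """Scan regions [threshold, infinity) starting at each data point.

    data_sum_pt: summed-pT values of the data events in one final state.
    sm_expected_above: function giving the SM expectation for events
        with summed pT at or above a threshold.
    Returns (threshold, n_data, p_value) for the least probable region.
    """
    values = sorted(data_sum_pt)
    best = None
    for idx, threshold in enumerate(values):
        n_data = len(values) - idx  # data events from this point upwards
        p = poisson_tail(n_data, sm_expected_above(threshold))
        if best is None or p < best[2]:
            best = (threshold, n_data, p)
    return best


def toy_sm_tail(threshold):
    """Hypothetical SM expectation above a threshold (GeV)."""
    return 20.0 * math.exp(-threshold / 100.0)


region = most_interesting_region([50, 80, 120, 400, 450, 500], toy_sm_tail)
```

In the toy input the three events at 400 GeV and above sit where well under one event is expected,
so the scan selects the region starting at 400 GeV.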

The distribution of $\mathcal{P}$ for the final states considered by Sleuth in the data is shown in
Fig. 6. The concavity of this distribution reflects the crudeness (i.e. under-tuning) of our
correction model. A crude correction model results in more outliers than expected, which, when
converted into values of $\mathcal{P}$, produces excesses at the extremes of both low and high
probability.

The $\sum p_T$ distributions of the four most interesting final states found by Sleuth are shown in
Fig. 7. These are: $e^\pm\mu^\pm$, $e^\pm\mu^\pm jj\not{p}_T$, $e^\pm\mu^\pm\not{p}_T$, and
$e^\pm e^\mp\mu^\pm\not{p}_T + \mu^\pm\mu^\mp e^\pm\not{p}_T$. It is intriguing to note that all
four contain the rare signature of a same-sign electron-muon pair.
Such a signature can arise in a number of ways. SM processes that produce real electrons and muons
with the same charge include $WZ$ production with leptonic decays, where one of the leptons is not
reconstructed in the detector. There are also processes which produce real electrons in the forward
region of the CDF II detector, where the reduced tracking coverage means the electron charge sign
has a higher probability of being falsely reconstructed; such processes include $t\bar{t}$
production, and $Z \to \tau^+\tau^-$ where both taus decay leptonically. In addition, there are
processes with a real muon and a fake electron. These are largely $W/Z$+jets production, where a
primary quark or gluon jet is misidentified as an electron in the detector, and
$W\gamma/Z\gamma$, where the photon undergoes conversion to produce an electron. Also relevant is
the case when both the electron and the muon are fakes, predominantly from dijet events. The
relative proportion of these potential backgrounds varies for each final state, depending on the
presence of $\not{p}_T$ and the number of jets. Since all of these processes and detector effects
also contribute to other more highly-populated final states where good agreement is seen, their
rates are quite well constrained by this global analysis.

However, while it is noteworthy that the top four final states all contain the same rare signature,
this is an a posteriori observation and its significance is therefore difficult to estimate.
Sleuth's a priori procedure is to calculate the significance of only the single most discrepant
final state. We find that $\tilde{\mathcal{P}} = 0.08$, i.e. that 8% of pseudoexperiments drawn
from the Vista SM implementation would have produced a more significant excess in a single final
state purely by chance fluctuations. This is far from the threshold of
$\tilde{\mathcal{P}} < 0.001$, and therefore we do not pursue this as a potential discovery.

In summary, CDF has performed a model-independent global search for new high-$p_T$ physics in
2.0 fb$^{-1}$ of $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV. The populations of 399 exclusive
final states are compared to a standard model prediction, but no significant discrepancy is found
after accounting for the trials factor associated with looking in many places. The shapes of 19 650
kinematic distributions are also studied, and although 555 show a significant discrepancy, most of
these are attributed to inadequate modeling of soft QCD jet emission in the underlying Monte Carlo
prediction, rather than a sign of new physics. A Bump Hunter algorithm scans invariant mass
distributions for narrow bumps that could indicate resonant production of new particles: only one
significant bump is found, and it is attributed to the same underlying problem as above. The Sleuth
algorithm searches the $\sum p_T$ spectrum of each final state, but finds no significant excesses
of data over the SM prediction in the tails of any single distribution. This CDF global search has
not discovered new physics in 2.0 fb$^{-1}$.

FIG. 7: The $\sum p_T$ distributions of the four most interesting final states found by Sleuth.
Data are shown as filled (black) circles, with the expected contribution from standard model
processes shown as the shaded (red) histograms and identified in the legend. The category "Other"
represents the sum of all remaining relevant SM processes, each of which individually is smaller
than the smallest itemized contribution. The label in the top left corner of each plot lists the
objects in the final state, where $\ell$ is a lepton ($e$ or $\mu$), $\ell'$ is an additional
lepton of different flavor, $j$ denotes a jet, and $\not{p}_T$ represents missing transverse
momentum. Global charge conjugation is implied, so that a final state labeled $\ell^+\ell'^+$ also
includes $\ell^-\ell'^-$. The region with the most significant excess of data over the SM
expectation is indicated by the arrow below the $x$-axis, and displayed in the inset with the
number of events expected (SM) and observed (d). The significance of the excess is shown by the
value of $\mathcal{P}$ in the top right corner.



